Purpose


NASA's mission includes furthering our understanding of biological systems through
space-based research in order to improve life on Earth and to enable the human exploration of
space. To achieve these goals, NASA is investing in GeneLab, a multi-year effort to conduct 
biological and medical research in space, principally aboard the International Space Station 
(ISS). Through the GeneLab project, researchers will include high-throughput genomic, 
transcriptomic, proteomic or other "omics" assays as part of their experiments conducted 
on the ISS. The raw data from these assays will be stored in the GeneLab Data Systems 
(GLDS) currently being developed for the project. GeneLab intends to support "open 
science" research on the housed data sets, creating a multiplier effect on the science return 
from these experiments; the GLDS will serve all omics data without restrictions to the 
public. The system will ultimately include a biocomputation platform with collaborative 
science capabilities, to enable the discovery and validation of molecular networks that are 
influenced by space conditions. 



Figure (GeneLab workflow):

• Experiment Initiation: GeneLab experiment is prepared and launched.

• Experiment on ISS: Crew performs experimental protocol and collects samples.

• Return to Earth: Material is sent back to Earth for processing. Controls (ground and/or flight) are processed at the same time.

• Process Samples: Extracted DNA, RNA and/or protein is sent to an omics center to generate sequence, transcript or protein expression data.

• Data returned to GeneLab for analysis. Raw data uploaded into GeneLab Data Systems for public viewing and analysis.

• Modeling and Validation: Integrated analysis of raw data for cross-species systems biological model development.

• Data Sharing: Data shared with the larger scientific community. Results feed back to GeneLab and other databases, accelerating scientific discovery by leveraging a bigger community.

• Next Generation Research: Iterative research solicitations for experiments utilizing GeneLab data for ground validation and next generation flight research.


http://genelab.nasa.gov/ 


Design

NASA has chosen a phased capability implementation for the GLDS. The initial phase of the
GLDS development effort emphasizes capabilities for the submission, curation, search, and 
retrieval of omics data sets. Important design considerations included: 

• leveraging existing systems and systems components to deploy Phase 1 
capabilities expediently; 

• determining optimal data set curation procedures (including metadata
representation and generation and quality control procedures); and

• balancing GLDS accessibility with user engagement and usage tracking 
capabilities. 
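
The curation consideration above can be illustrated with a minimal sketch of a metadata quality-control check. The field names (`organism`, `assay_type`, `mission`, `data_files`) and the `check_metadata` helper are hypothetical and are not taken from the GLDS; the sketch only shows the general idea of flagging submissions whose required metadata is missing or empty before they enter a repository.

```python
# Hypothetical required fields; the actual GLDS curation rules are not given here.
REQUIRED_FIELDS = ("organism", "assay_type", "mission", "data_files")


def check_metadata(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # Treat absent, empty-string, and empty-list values as missing.
        if value is None or value == "" or value == []:
            problems.append(f"missing required field: {field}")
    return problems


submission = {
    "organism": "Mus musculus",
    "assay_type": "transcriptomic",
    "mission": "",  # left blank by the submitter
    "data_files": ["sample1.fastq.gz"],
}

for problem in check_metadata(submission):
    print(problem)
```

A real curation pipeline would presumably also validate values against controlled vocabularies; that step is omitted here.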

Phase 2 will focus on interoperability and supporting federation of GLDS-housed data sets 
with externally-curated data in data set search, retrieval, and annotation functions. 
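
As a rough illustration of what federated search could look like, the sketch below merges keyword hits from several catalogs into one result list. Everything in it is an assumption made for illustration: the `DataSetHit` type, the `federated_search` function, the catalog layout, and the accession strings are hypothetical and do not describe the GLDS or any external repository's interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSetHit:
    accession: str  # identifier within its source repository (hypothetical)
    source: str     # name of the catalog the hit came from
    title: str


def federated_search(keyword: str, catalogs: dict[str, list[dict]]) -> list[DataSetHit]:
    """Search every catalog for the keyword and merge the hits into one list."""
    keyword = keyword.lower()
    hits = []
    for source, records in catalogs.items():
        for record in records:
            if keyword in record["title"].lower():
                hits.append(DataSetHit(record["accession"], source, record["title"]))
    # Sort by source, then accession, so results are stable across runs.
    return sorted(hits, key=lambda h: (h.source, h.accession))


catalogs = {
    "GLDS": [
        {"accession": "EXAMPLE-1", "title": "Mouse liver transcriptome, spaceflight"},
    ],
    "ExternalRepo": [
        {"accession": "EXT-42", "title": "Mouse liver transcriptome, ground control"},
        {"accession": "EXT-43", "title": "Yeast proteome"},
    ],
}

for hit in federated_search("mouse liver", catalogs):
    print(hit.source, hit.accession, hit.title)
```

In practice each external source would be queried through its own interface and its metadata mapped to a common schema; the in-memory dictionaries stand in for that step.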

In Phase 3 we will develop a platform for computational biologists to execute and 
collaboratively develop analysis tools, integrate results of analyses with those of 
collaborating researchers, annotate data sets with interpretations, and share insights and 
hypotheses, building a knowledge base relevant to space biology and medicine. 


Phased Implementation 2014-2021

Phase 1: Searchable Data (FY2014-2015)

IT Systems

• System Requirements & Architecture

• Public Website

• Searchable Data Repository

• Requirements level 1

Science

• Omics Center Selection

• Protocol Development

• Data analysis validation

• Initiate ground controls

• Collaborate with two manifested flight experiments

• SDT Solicitation for Dedicated

Phase 2: Data Acquisition (FY2015-2016)

IT Systems

• Link to Public Databases

• Beta Space bioinformatics system

Science

• Omics Center Selection

• Data analysis from initial ground studies

• Science Definition Teams Identified

• Outreach Program Plan

Phase 3: System Integration (FY2017-2018)

IT Systems

• Integrated Platform across model organisms

• Build Community via collaborative science

Science

• Continue ground controls and process enhancement

• Engage with Scientists external to NASA as part of Outreach Program

• Dedicated flight experiments

Phase 4: Implementation (FY2019-2021)

• Full science community engagement

• Development of analytical and Modeling tools

• Ongoing dedicated flight experiments

• Website and platform sustaining activities

• Continuous improvement


Challenges

The GeneLab project and the GLDS face several challenges, including:

• Numerous and rapidly evolving standards for omics data, metadata, protocols and analyses

• Representing the unique aspects of space biology experiments as metadata and related data

• Coalescing access to a wide range of biocomputation tools, overcoming lack of interoperability

• Incentivizing engagement by the space biology/medicine research community

Figure labels (standards and tools): CIMR, MIBBI, MIAPE, MINSEQE, isacommons, isacreator, configurator, linkedisa, isatordf.
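
To make the metadata challenge more concrete, here is a minimal sketch of how space-specific experimental factors might sit alongside conventional omics metadata. It is loosely modeled on the Investigation / Study / Assay hierarchy used by the ISA framework, but it is heavily simplified: the class fields (`condition`, `platform`, `duration_days`) and the example values are hypothetical and are not the GLDS schema.

```python
from dataclasses import dataclass, field


@dataclass
class Assay:
    measurement: str  # e.g. "transcription profiling"
    technology: str   # e.g. "RNA sequencing"
    raw_files: list[str] = field(default_factory=list)


@dataclass
class Study:
    title: str
    organism: str
    # Space-specific factors (hypothetical names, for illustration only).
    condition: str      # "flight" or "ground control"
    platform: str       # e.g. "ISS"
    duration_days: int
    assays: list[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: list[Study] = field(default_factory=list)

    def conditions(self) -> set[str]:
        """Collect the distinct experimental conditions across all studies."""
        return {study.condition for study in self.studies}


rna_seq = Assay("transcription profiling", "RNA sequencing", ["flight_rep1.fastq.gz"])
investigation = Investigation(
    title="Example rodent experiment",
    studies=[
        Study("Flight group", "Mus musculus", "flight", "ISS", 30, [rna_seq]),
        Study("Ground group", "Mus musculus", "ground control", "ground facility", 30),
    ],
)

print(sorted(investigation.conditions()))
```

Keeping the flight and ground-control groups in the same investigation makes it straightforward to check that every flight study has a matching control before analysis.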



Conclusions 




Figure labels: Experiment Samples; Principal Investigator; GeneLab Repository; External Omics Repositories & Knowledge Bases; GeneLab Platform; Integrative Analysis (x-species, x-system, multi-assay); Analysis Workspace.



High-throughput systems for analyzing genomes and transcriptomes are generating vast amounts of
biological data. Integrated analyses of these data with proteomic, metabolomic, physiologic, and
phenotypic information hold the promise of rapid elucidation of complex molecular pathways, and
of huge advancements in biological understanding and space medicine. Data systems like the
nascent GLDS are challenged with the representation, organization and integration of these highly
complex data sets, and with providing researchers with the tools and environments they require for
maximum utilization of the data in their analyses.












